MOMENT CLOSURE IN A MORAN MODEL WITH RECOMBINATION 



ELLEN BAAKE AND THIEMO HUSTEDT 

Abstract. We extend the Moran model with single-crossover recombination to include general
recombination and mutation. We show that, in the case without resampling, the expectations
of products of marginal processes defined via partitions of sites form a closed hierarchy, which is
exhaustively described by a finite system of differential equations. One thus has the exceptional
situation of moment closure in a nonlinear system. Surprisingly, this property is lost when
resampling (i.e., genetic drift) is included.



1. Introduction 

In recent years, the processes of population genetics, which describe the genetic structure of
populations under the influence of evolutionary forces such as mutation, selection, recombination,
migration, and genetic drift, have been a rich source of fascinating probabilistic problems. More
precisely, the dynamics is often well understood in the limit of infinite population size, where a
law of large numbers leads to a deterministic description (in terms of discrete dynamical systems
or differential equations), but great challenges ensue if the population is finite, in particular if
there is interaction between individuals, such as competition (selection) or recombination (the
combination of genetic material of two parents into the 'mixed' genetic type of an offspring); see
[6, 9, 11]. Interactions usually make the infinite-population model nonlinear and, often, already
difficult enough to treat. In the corresponding stochastic model, they are reflected by transition
rates (or probabilities) that depend nonlinearly on the current state of the system and often result
in processes whose treatment provides enormous challenges. Even the relationship between the
stochastic process and its deterministic counterpart is usually unclear (apart from the infinite
population limit). In particular, the expectation of the stochastic process is, usually, not given by
the corresponding deterministic dynamics; in general, such coincidence is reserved for populations
of individuals that evolve independently (as in branching processes), or systems with interactions
that do not change the expectation (like Wright-Fisher sampling).

Indeed, even the analysis of the expectation is difficult in most processes of population genetics
with interaction. Its dynamics usually depends not only on the current expectation, but also
on higher moments, whose change, in turn, depends on even higher moments. Formulating this
hierarchy of dependencies is a common approach for stochastic processes arising in various
applications in physics, chemistry, and biology [15, 14, 8]. Usually, this hierarchy continues indefinitely
(it does not 'close'); to extract at least an approximation to the (lower) moments of interest, some
method of 'moment closure' must be employed (in the simplest case, truncation) [5].

The corresponding deterministic systems (that arise through a law of large numbers) are also
often tackled via systems of moments or cumulants, see [6, Ch. V.4] for an overview. Models of
recombination take a special role between linear and nonlinear models. Although there is abundant
interaction and hence nonlinearity, the deterministic system that describes the frequencies of all
possible (geno)types may be (exactly) transformed into a linear one by embedding it into a
higher-dimensional space (more explicitly, by adding further components that correspond to products of
type frequencies). This method is known as Haldane linearisation [16]. The underlying linear
structure even allows a diagonalisation and explicit solution, see [18] and references therein. In
certain important special cases (notably, in so-called single-crossover dynamics in continuous time),
this solution is surprisingly simple and immediately plausible [3].

Figure 1. Recombination

Figure 2. Recombination event defined by the set G (circled sites)



Elucidating underlying linear structures in the corresponding stochastic system (more precisely, 
in the Moran model with recombination) has only started very recently. In the aforementioned 
single-crossover case, Bobrowski et al. [5] analysed the asymptotic behaviour in the presence of 
mutation. Baake and Herms [4] observed that the expected type frequencies in the finite system 
(but without genetic drift) follow those in the deterministic model; this could be explained by 
the (conditional) independence of certain marginalised processes that appear as 'subsystems' of 
the stochastic model. This and other results now lead to the question of whether, in the general
recombination scheme (i.e., not restricted to single crossovers), the dynamics of the expectations
may be embedded into a higher- but finite-dimensional space, such that they are given by a finite
system of differential equations. Is there an equivalent of Haldane linearisation in the sense of
moments?

This article will address these questions in the framework of the Moran model with recombina- 
tion and mutation. In particular it will show that the system of moments closes here after a finite 
number of steps, without any need for approximations, as long as there is no genetic drift. This 
may be considered as a stochastic analogue of Haldane linearisation. 

2. Moran model with recombination 

We consider a population of $N$ individuals. Each of them is endowed with the set $S = \{1, \dots, n\}$
of sites. These can be interpreted as nucleotide positions in a string of DNA or as gene loci on a
chromosome. For each site $i$ there is a finite set $X_i$ of alleles that may occur at site $i$. A string of
alleles is then called a type; $X := \times_{i=1}^{n} X_i$ is the type space.

We are interested in modelling recombination, which means the rearrangement of genetic material
in sexually reproducing populations. It may occur during meiosis, the creation of gametes,
that is, egg cells or sperm. Homologous chromosomes may cross over at some points and exchange
the genetic material in between (see Figure 1).

In the following we will assign recombination events to subsets of sites in a natural way. Let
$G \subseteq S$. Then the corresponding recombination event between two individuals is the following: the
alleles at the sites given by $G$ remain at their positions, whereas the alleles at the sites in
$\bar G := S \setminus G$, the complement of $G$, are exchanged (see Figure 2).

Figure 3. Moran model with recombination. At some time, the second individual,
which is of type $x$, undergoes a recombination event corresponding to $G$ and
chooses its partner randomly, here the fourth individual, which is of type $y$; from
that time on, the individuals are of type $p_G(x,y)$ and $p_{\bar G}(x,y)$.



We define the mappings $p_G : X \times X \to X$, $G \subseteq S$, by

(1)  $p_G(x,y) := \; : \big(\times_{i \in G} \{x_i\}\big) \times \big(\times_{i \in \bar G} \{y_i\}\big) : \;$,

where $: \cdots :$ means that the coordinates are ordered as in $X$. So, $p_G(x,y)$ and $p_{\bar G}(x,y)$ are the
new types resulting from the recombination event corresponding to $G$ between the types $x$ and
$y$. Obviously, $p_G(x,y) = p_{\bar G}(y,x)$. So, $G$ and $\bar G$ essentially correspond to the same recombination
event.
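
As an illustration of definition (1), the following Python sketch represents types as tuples and a recombination event by a set of site indices. The function name `recombine` and the 0-based site numbering are assumptions of this example only; they are not part of the model above.

```python
def recombine(x, y, G):
    """Return the pair (p_G(x, y), p_Gbar(x, y)) for two types x and y.

    x, y: tuples of alleles of equal length n. Sites are numbered 0..n-1
          here (an assumption of this sketch; the text numbers them 1..n).
    G:    set of site indices.
    """
    n = len(x)
    p_g = tuple(x[i] if i in G else y[i] for i in range(n))     # x on G, y on the complement
    p_gbar = tuple(y[i] if i in G else x[i] for i in range(n))  # y on G, x on the complement
    return p_g, p_gbar


x = ("a", "a", "a", "a")
y = ("b", "b", "b", "b")
G = {0, 2}
G_bar = set(range(len(x))) - G

children = recombine(x, y, G)      # event G, initiated by x
swapped = recombine(y, x, G_bar)   # event G_bar with the roles of x and y exchanged
```

Both calls produce the same pair of offspring types, which mirrors the remark that $G$ and its complement describe the same recombination event.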

We now define a Moran model with recombination and mutation. Each individual undergoes
recombination events corresponding to $G \subseteq S$ at rate $\varrho_G/4$, for all $G \subseteq S$. The recombination
partner is chosen out of the whole population (including the initiating individual itself). Then they
exchange their genetic material according to the recombination event corresponding to $G$ (see
Figure 3). To keep things well-defined, the recombination rates $\varrho_G$ have the properties
$\varrho_G = \varrho_{\bar G}$ and $\varrho_G \geq 0$.

Furthermore, mutation events may occur. An allele $x_i \in X_i$ at site $i$ mutates into allele $y_i \in X_i$
with rate $\mu^i_{x_i y_i} \geq 0$. Thus, the mutation rate depends on both the parental and the offspring
allele.

Additionally, we introduce birth events or, more precisely, resampling. Each individual produces
an offspring at rate $b/2 \geq 0$. The offspring inherits the parent's type and replaces another
individual, randomly chosen from the entire population (again including the parent individual).

In the following we are interested in the composition of the population, so we define the stochastic
process $(Z_t)_{t \geq 0}$ with state space $E := \{\omega \text{ counting measure on } X \text{ with } \omega(X) = N\}$ by

$Z_t(\{x\}) :=$ number of individuals of type $x$ at time $t$.

In the following we will use shorthands like $Z_t(x)$, $z(x)$ instead of $Z_t(\{x\})$, $z(\{x\})$. Recombination,
mutation and resampling events induce the following transitions if $Z_t = z$:

(2)  $z \to z + v_{G,x,y}$ with $v_{G,x,y} := -\delta_x - \delta_y + \delta_{p_G(x,y)} + \delta_{p_{\bar G}(x,y)}$
     at rate $\frac{1}{N}\,\varrho_G\, z(x)\, z(y)$ for $x, y \in X$, $G \subseteq S$,

(3)  $z \to z - \delta_{(x_1,\dots,x_i,\dots,x_n)} + \delta_{(x_1,\dots,y_i,\dots,x_n)}$ at rate $\mu^i_{x_i y_i}\, z(x)$,

(4)  $z \to z + \delta_x - \delta_y$ at rate $\frac{b}{2N}\, z(x)\, z(y)$.

The rate in (2) is determined in the following way: Individuals of type $x$ initiate recombination
according to $G$ at total rate $\frac{1}{4}\varrho_G z(x)$, and each of them chooses an individual of type $y$ with
probability $z(y)/N$. This leads to the rate $\frac{1}{4N}\varrho_G z(x) z(y)$, which needs to be multiplied by 4
to account for the fact that the recombination could be initiated by an individual of type $y$ and that
recombination according to $G$ is the same as recombination according to $\bar G$.
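
The verbal description of the model can be turned into a small event-driven simulation. The sketch below is only an illustration under stated assumptions: mutation is left out, sites are numbered from 0, and the names `simulate` and `allele_counts` are hypothetical. Each individual initiates a recombination event for the set $G$ at rate $\varrho_G/4$ and a resampling event at rate $b/2$; the partner (or the replaced individual) is drawn uniformly from the whole population.

```python
import random
from collections import Counter


def simulate(pop, rho, b, t_end, rng):
    """Run recombination and resampling up to time t_end (mutation omitted).

    pop: list of types (tuples); rho: dict mapping frozenset of 0-based sites
    to the rate rho_G; b: resampling parameter; rng: random.Random instance.
    """
    pop = list(pop)
    N, n = len(pop), len(pop[0])
    # Per-individual event rates: rho_G / 4 for each G, and b / 2 for resampling.
    events = [("rec", G, r / 4) for G, r in rho.items() if r > 0]
    if b > 0:
        events.append(("res", None, b / 2))
    if not events:
        return pop
    weights = [rate for _, _, rate in events]
    total = N * sum(weights)
    t = rng.expovariate(total)
    while t_end >= t:
        kind, G, _ = rng.choices(events, weights=weights)[0]
        i, j = rng.randrange(N), rng.randrange(N)  # initiator and partner (may coincide)
        if kind == "rec":
            a, c = pop[i], pop[j]
            pop[i] = tuple(a[s] if s in G else c[s] for s in range(n))
            pop[j] = tuple(c[s] if s in G else a[s] for s in range(n))
        else:
            pop[j] = pop[i]  # offspring of i replaces j
        t += rng.expovariate(total)
    return pop


def allele_counts(pop, site):
    """Allele counts at one site, i.e. a one-site marginal of the population."""
    return Counter(ind[site] for ind in pop)


rng = random.Random(1)
start = [(0, 0, 0)] * 6 + [(1, 1, 1)] * 4
rho = {frozenset({0}): 1.0, frozenset({1, 2}): 1.0,
       frozenset({0, 1}): 0.5, frozenset({2}): 0.5}   # symmetric under complement
end = simulate(start, rho, b=0.0, t_end=5.0, rng=rng)
```

With `b=0.0` the population size and all one-site allele counts are preserved along every trajectory, since recombination only redistributes alleles among individuals.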

A brief comment on the model is in order. We consider recombination and reproduction as
independent events whereas, in true biology, recombination is coupled to reproduction. We use
the decoupled version here because it is simpler, and because it allows us to separate clearly the
effects of random recombination from those of random reproduction. This version is also used
elsewhere [17], on the argument that recombination events are rare.

For a subset $G \subseteq S$ we define $X_G := \times_{i \in G} X_i$ and the mapping $\pi_G : X \to X_G$ as the canonical
projection. Let $\omega$ be a (signed) measure on $X$. We define the pullback $\pi_G.$ by $\pi_G.\omega := \omega \circ \pi_G^{-1}$.
So, $\pi_G.$ maps a measure on $X$ onto its corresponding marginal measure on $X_G$.

In the following, marginal processes of Zt will play a crucial role. The following proposition 
states that these are Markov chains, too. It is an extension of Lemma 1 in [4]. 

Proposition 1. Let $I \subseteq S$, and let $(Z_t)_{t \geq 0}$ be the recombination process as defined by equations
(2)-(4). Then $(\pi_I.Z_t)_{t \geq 0}$ is a Markov process with state space
$E_I := \{\omega \text{ counting measure on } X_I,\ \omega(X_I) = N\}$.

Proof. Obviously, $(\pi_I.Z_t)_{t \geq 0}$ is a stochastic process on $E_I$.

We must show that the transition rates of $(\pi_I.Z_t)_{t \geq 0}$ only depend on the current state of the
process. A recombination event induces the following transition:

$\pi_I.z \to \pi_I.(z + v_{G,x,y})$,

with

(5)  $\pi_I.v_{G,x,y} = \delta_{\pi_I(p_G(x,y))} + \delta_{\pi_I(p_{\bar G}(x,y))} - \delta_{\pi_I(x)} - \delta_{\pi_I(y)}$

and $\pi_I(p_G(x,y)) = \; : \big(\times_{i \in G \cap I} \{x_i\}\big) \times \big(\times_{i \in \bar G \cap I} \{y_i\}\big) :$. This jump may be zero.

Consider now any nonzero jump. If it comes from a recombination event, it must be of the
form (5). That means there are types $x_I, y_I \in X_I$ and a subset $H$ of $I$ such that $(\pi_I.z)(x_I)$
and $(\pi_I.z)(y_I)$ both decrease by one and the frequencies of the marginal types arising in the
recombination event corresponding to $H$ increase. The rate for this transition is then given by the
sum of all transitions of the original process that induce this transition in the marginal process:

(6)  $\sum_{G \subseteq S:\, G \cap I = H}\ \sum_{x \in X:\, \pi_I(x) = x_I}\ \sum_{y \in X:\, \pi_I(y) = y_I} \frac{\varrho_G}{N}\, z(x)\, z(y)
      = \sum_{G \subseteq S:\, G \cap I = H} \frac{\varrho_G}{N}\, (\pi_I.z)(x_I)\, (\pi_I.z)(y_I)
      = \frac{\varrho^I_H}{N}\, (\pi_I.z)(x_I) \cdot (\pi_I.z)(y_I)$,

with $\varrho^I_H := \sum_{G \subseteq S:\, G \cap I = H} \varrho_G$. So, this last term depends only on the current state of the marginal
process $(\pi_I.Z_t)_{t \geq 0}$.

A mutation event of an individual of type $x$ at site $i$ from allele $x_i$ to allele $y_i$ induces the
following transition of $\pi_I.Z_t$:

$\pi_I.z \to \pi_I.z - \delta_{\pi_I(x_1,\dots,x_i,\dots,x_n)} + \delta_{\pi_I(x_1,\dots,y_i,\dots,x_n)}$.

This jump is zero if $i \notin I$. Obviously, the transition rate is $\mu^i_{x_i y_i}\, (\pi_I.z)(\pi_I(x))$ and depends merely
on the current state of $\pi_I.Z_t$, too. The case of resampling is treated analogously. □

This proof is an example of the so-called lumping procedure for Markov chains; compare [7, 12]
for the general context or [2] for the sequence context considered here.

Remark 1. A comparison between (2) and (6) shows that the marginal process $(\pi_I.Z_t)_{t \geq 0}$ can
itself be considered as a recombination process on the sites $I$. So, assertions about $Z_t$ will also
hold for all derived marginal processes.
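
The marginal recombination rates that appear in the proof of Proposition 1, namely the sums of $\varrho_G$ over all $G$ with $G \cap I = H$, are easy to tabulate. The following sketch does this for a toy rate function; the helper names `all_subsets` and `marginal_rates` and the particular rates are assumptions made for illustration.

```python
from itertools import combinations


def all_subsets(sites):
    """All subsets of a collection of sites, as frozensets."""
    sites = sorted(sites)
    return [frozenset(c) for r in range(len(sites) + 1) for c in combinations(sites, r)]


def marginal_rates(rho, I):
    """Return {H: sum of rho_G over all G with G & I == H} for every subset H of I."""
    I = frozenset(I)
    out = {H: 0 for H in all_subsets(I)}
    for G, r in rho.items():
        out[G & I] += r
    return out


S = frozenset({1, 2, 3})
# Toy rates (an assumption): rho_G = |G| * |S \ G|, which is symmetric under complement.
rho = {G: len(G) * len(S - G) for G in all_subsets(S)}
rho_12 = marginal_rates(rho, {1, 2})
```

The lumped rates inherit the symmetry under complementation within $I$, so the marginal process again has the form of a recombination process, as stated in Remark 1.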

3. Recombination alone 

In this section we restrict ourselves to the case without mutation and resampling, that means
with $\mu^i_{x_i y_i} = b = 0$ for all $i \in S$ and $x_i, y_i \in X_i$.

Since $v_{G,x,y} = 0$ for some $x, y \in X$, there are 'empty' recombination events at positive rate,
but including these redundancies is what makes the rates in (2) so simple. The rates become considerably
more complicated if only 'true jumps' are considered. This is already visible in the projection
onto a single type. Let $x \in X$ and $Z_t = z$. In order to figure out the rate for the transition
$z(x) \to z(x) + 1$, we first determine the set of all pairs of types $\tilde x, \tilde y \in X$ such that for a given
$G \subseteq S$ the jump $v_{G,\tilde x,\tilde y}(x)$ equals 1:

$\{\{\tilde x, \tilde y\} \subseteq X : v_{G,\tilde x,\tilde y}(x) = 1\}
  = \{\{\tilde x, \tilde y\} \subseteq X : \pi_G(\tilde x) = \pi_G(x),\ \pi_{\bar G}(\tilde y) = \pi_{\bar G}(x),\ \tilde x \neq x,\ \tilde y \neq x\}$

$  = \{\{\tilde x, \tilde y\} \subseteq X : \tilde x \in \pi_G^{-1}(\pi_G(x)) \setminus \{x\},\ \tilde y \in \pi_{\bar G}^{-1}(\pi_{\bar G}(x)) \setminus \{x\}\}$.

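Summing over all $G \subseteq S$ counts every pair $\{G, \bar G\}$ twice, which is compensated by a factor
$1/2$. This leads to the transition rate

(7)  $\sum_{G \subseteq S} \frac{\varrho_G}{2N}\, \big[(\pi_G.z)(\pi_G(x)) - z(x)\big]\, \big[(\pi_{\bar G}.z)(\pi_{\bar G}(x)) - z(x)\big]$.

The transition rate for $z(x) \to z(x) - 1$ can be figured out analogously; $v_{G,\tilde x,\tilde y}(x) = -1$ iff $\tilde x = x$ and
$\tilde y$ is any type which is neither in $\pi_G^{-1}(\pi_G(x))$ nor in $\pi_{\bar G}^{-1}(\pi_{\bar G}(x))$. Since
$\pi_G^{-1}(\pi_G(x)) \cap \pi_{\bar G}^{-1}(\pi_{\bar G}(x)) = \{x\}$, one has the rate

(8)  $\sum_{G \subseteq S} \frac{\varrho_G}{2N}\, z(x)\, \big[N - (\pi_G.z)(\pi_G(x)) - (\pi_{\bar G}.z)(\pi_{\bar G}(x)) + z(x)\big]$.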

Our aim is now to reformulate the process with the help of additional random variables, so that
the transition rates become simpler, in particular, unaffected by empty events. To this end, we
define two new counting measures derived from $(Z_t)_{t \geq 0}$, namely $(U_t)_{t \geq 0}$ by $U_0(x) = 0$ and

$U_t(x) :=$ number of events at which $x$-individuals are created until time $t$

and $(V_t)_{t \geq 0}$ by $V_0(x) = 0$ and

$V_t(x) :=$ number of events at which $x$-individuals are broken up until time $t$.

These processes also count events at which $Z_t$ does not change, namely the case that individuals
of type $x$ are created and broken up at the same time. This may happen when an individual of
type $x$ recombines according to $G$ with an individual of type $y$ with $\pi_G(y) = \pi_G(x)$. Whenever
this occurs, both counters increase but their difference remains unchanged. Altogether, we thus
have

(9)  $Z_t = Z_0 + U_t - V_t$

with the transition rates of $U_t$ and $V_t$ unaffected by 'empty' events: For $U_t(x) = u$, the transition
$u \to u + 1$ happens at rate

$\sum_{G \subseteq S} \frac{\varrho_G}{2N}\, (\pi_G.z)(\pi_G(x)) \cdot (\pi_{\bar G}.z)(\pi_{\bar G}(x))$,

and for $V_t(x) = v$, the transition $v \to v + 1$ happens at rate

$\sum_{G \subseteq S} \frac{\varrho_G}{2}\, z(x)$.

In the following, marginal processes will emerge frequently. We introduce a short-hand, symbolic
notation similar to the one described in [1]. Fix an arbitrary $x \in X$ and define for a subset
$G = \{g_1, \dots, g_{|G|}\}$ of sites

$[G]_t := [g_1, \dots, g_{|G|}]_t := (\pi_G.Z_t)(\pi_G(x))$.

$[G]_t$ is the number of individuals that are identical to $x$ at the sites corresponding to $G$, at time
$t$. Again, we use shorthands $[g_1, \dots, g_{|G|}]_t$ instead of $[\{g_1, \dots, g_{|G|}\}]_t$. Note that we suppress the
dependence on $x$ in $[G]_t$ for ease of notation. Analogously, we define for the processes $U_t$ and $V_t$:

$(G)_t := (\pi_G.U_t)(\pi_G(x)), \qquad \langle G \rangle_t := (\pi_G.V_t)(\pi_G(x))$.

By Remark 1 we can now consider $[G]_t$ as a recombination process on $|G|$ sites evaluated at
the type $(x_{g_1}, \dots, x_{g_{|G|}})$. For $|G| = 2$, the distribution of $(g_1, g_2)_t$ can be given explicitly because
the transition rates

(10)  $\sum_{H \subseteq S:\, |H \cap G| = 1} \frac{\varrho_H}{2N}\, [g_1]_t\, [g_2]_t =: \alpha$

are constant in time because all 1-site marginals are constant in time. So, $(g_1, g_2)_t$ follows a
Poisson distribution with parameter $\alpha t$.

3.1. Analysis of the expectation. Since we will use it frequently, we want to recall an elementary
fact concerning the dynamics of the mean of a continuous-time Markov chain with a finite
state space, which is often used implicitly. The proof is a straightforward exercise that can be
found in [4, Fact 1], for example.

Lemma 1. Let $(Z_t)_{t \geq 0}$ be a Markov process with finite state space $E \subset \mathbb{Z}^d$ with transition rates
$q(z, z + v)$ for transitions from $z$ to $z + v$ for $z \in E$, $v \in \mathbb{Z}^d$ (let $q(z, z + v) = 0$ if $z + v \notin E$). Then
the following equation holds for all $t \geq 0$:

$\frac{d}{dt}\, \mathbb{E}[Z_t] = \mathbb{E}[F(Z_t)]$,

where $F$ is defined as

$F(z) := \sum_{v} v\, q(z, z + v)$.

□



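Lemma 1 together with the representation of $Z_t$ in (9) gives us the dynamics of the mean:

(11)  $\frac{d}{dt}\, \mathbb{E}\big[[1, \dots, n]_t\big] = \sum_{G \subseteq S} \frac{\varrho_G}{2}\, \mathbb{E}\Big[\frac{1}{N}\big([G]_t [\bar G]_t - N \cdot [1, \dots, n]_t\big)\Big]$.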
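
The drift of the number of $x$-individuals can be checked numerically for small populations. The sketch below assumes the model rates as described in Section 2 (each individual initiates event $G$ at rate $\varrho_G/4$, the partner is uniform over the whole population, and both recombined types are kept) and compares a brute-force enumeration of all events with the closed expression $\sum_{G \subseteq S} \frac{\varrho_G}{2}\big(\frac{1}{N}[G][\bar G] - [1,\dots,n]\big)$. The function names and the toy rates are hypothetical; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import combinations


def count_matching(pop, x, sites):
    """Number of individuals agreeing with x on the given sites (the bracket [G])."""
    return sum(all(ind[i] == x[i] for i in sites) for ind in pop)


def drift_by_enumeration(pop, x, rho):
    """Expected rate of change of z(x), summing over all (initiator, partner, G)."""
    N, n = len(pop), len(x)
    total = Fraction(0)
    for G, r in rho.items():
        for a in pop:        # initiating individual
            for b in pop:    # partner, drawn uniformly (may be the initiator itself)
                c1 = tuple(a[i] if i in G else b[i] for i in range(n))
                c2 = tuple(b[i] if i in G else a[i] for i in range(n))
                change = (c1 == x) + (c2 == x) - (a == x) - (b == x)
                total += Fraction(r, 4 * N) * change
    return total


def drift_by_formula(pop, x, rho):
    """Closed expression: sum over G of rho_G/2 * ([G][G_bar]/N - [1..n])."""
    N, n = len(pop), len(x)
    S = frozenset(range(n))
    total = Fraction(0)
    for G, r in rho.items():
        product = count_matching(pop, x, G) * count_matching(pop, x, S - G)
        total += Fraction(r, 2) * (Fraction(product, N) - count_matching(pop, x, S))
    return total


pop = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1), (1, 1, 1), (0, 0, 0)]
# Toy integer rates (an assumption), symmetric under complementation of G.
rho = {frozenset(c): 1 + len(c) * (3 - len(c)) for r in range(4) for c in combinations(range(3), r)}
x = (0, 0, 0)
```

Calling `drift_by_enumeration(pop, x, rho)` and `drift_by_formula(pop, x, rho)` returns the same rational number for any population, which is the state-wise identity behind the equation for the mean.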



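The motivation for this comes from the well-understood special case of single crossovers [1].
Here, all recombination rates that are attached to multiple-crossover recombination events vanish.
This affects all $\varrho_G$ with $G$ that either contain neither 1 nor $n$, or have gaps.

In this case, the induced marginal processes are conditionally independent of each other and so
moment closure is immediate [4, Lemma 1 and Theorem 1]:

$\frac{d}{dt}\, \mathbb{E}\big[[1, \dots, n]_t\big]
  = \sum_{G \subseteq S} \frac{\varrho_G}{2}\, \mathbb{E}\Big[\frac{1}{N}\big([G]_t [\bar G]_t - N \cdot [1, \dots, n]_t\big)\Big]
  = \sum_{G \subseteq S} \frac{\varrho_G}{2}\, \frac{1}{N}\Big(\mathbb{E}\big[[G]_t\big]\, \mathbb{E}\big[[\bar G]_t\big] - N \cdot \mathbb{E}\big[[1, \dots, n]_t\big]\Big)$.

We obtain a finite nonlinear system of differential equations, whose solution is known in closed
form [1].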

The independence relies on two properties. First, a single-crossover recombination event induces
a pair of marginal processes $[G]_t$, $[\bar G]_t$ for which $\{G, \bar G\}$ is an ordered partition of $S$. Second, a
single-crossover recombination event only affects one of the induced processes while leaving the
other one constant.

With general recombination both these properties are violated. First, marginal processes arise
that are given by non-ordered partitions, so even single-crossover recombination events may affect
both processes at the same instant. Second, a multiple-crossover recombination event may affect
the frequency of a pair of marginals that are given by an ordered partition. So, the independence
of the induced marginal processes is violated in two ways.

Let us now look at (11) again. On the right-hand side, an expectation of products emerges.
This is what one may expect due to the inherent nonlinearity of the recombination process.
Nevertheless, we see that no site arises more than once, so the arising products are described by a
partition of sites. This leads us to the following question: Given an arbitrary partition of sites,
what is the dynamics of the mean of the product of the induced marginal processes? Theorem 1
below answers this. For its formulation we need the following definition.

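Definition 1. Let $\{A_j\}_{j \in J}$ be a collection of sets with $A_i \cap A_j = \emptyset$, $i \neq j$. Define
$A_J := \bigcup_{j \in J} A_j$. Then, $G \subseteq A_J$ disrupts $\{A_j\}_{j \in J}$, denoted by $G | \{A_j\}_{j \in J}$, if
$G \cap A_j \notin \{\emptyset, A_j\}$ for all $j \in J$. For $|J| = 1$, we simply write $G | A_j$.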
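
To make the notion of disruption concrete, the following sketch implements the condition of Definition 1 (every block is met by $G$, but none is contained in $G$) and enumerates all disrupting subsets by brute force. The function names are hypothetical, and the enumeration is only meant for a handful of sites.

```python
from itertools import combinations


def disrupts(G, blocks):
    """True if, for every block A_j, the set G meets A_j but does not contain it."""
    G = frozenset(G)
    for A in blocks:
        A = frozenset(A)
        if G & A in (frozenset(), A):
            return False
    return True


def disrupting_subsets(blocks):
    """All subsets G of the union of the blocks that disrupt the whole collection."""
    union = sorted(set().union(*blocks))
    subsets = (frozenset(c) for r in range(len(union) + 1) for c in combinations(union, r))
    return [G for G in subsets if disrupts(G, blocks)]


two_blocks = [{1, 2}, {3, 4}]
one_block = [{1, 2, 3}]
```

For `two_blocks`, a disrupting set must pick exactly one site from each block, which gives four sets; for `one_block`, every nonempty proper subset disrupts, which gives six.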

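Note that for a collection of pairwise disjoint subsets of sites $\{A_j\}_{j \in J}$ disrupted by $G$, in a
recombination event corresponding to $G$ between individuals of marginal types $\pi_G(x)$ and $\pi_{A_J \setminus G}(x)$,
the processes $(A_j)_t$, $j \in J$, increase. Similarly, in the recombination event corresponding to $G$
between individuals of marginal types $\pi_{A_K}(x)$ and $\pi_{A_{J \setminus K}}(x)$, $K \subseteq J$, the processes $\langle A_j \rangle_t$, $j \in J$,
increase. Here $(A)_t$ and $\langle A \rangle_t$ denote the marginals of $U_t$ and $V_t$ at the sites $A$, evaluated at $x$.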

With these preparations, we are now ready to state 

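Theorem 1. Let $m \leq n$, $M := \{1, \dots, m\}$ and $\mathcal{A} = \{A_1, \dots, A_m\}$ be a partition of $\{1, \dots, n\}$.
Define $\mathcal{P}$ as the set of all triples $(I, J, K)$, where $\{I, J, K\}$ is a partition of $M$. Then

(12)  $\frac{d}{dt}\, \mathbb{E}\Big[\prod_{\ell \in M} [A_\ell]_t\Big]
       = \mathbb{E}\Big[\sum_{(I,J,K) \in \mathcal{P},\ I \neq M} (-1)^{|K|} \prod_{i \in I} [A_i]_t
         \sum_{K' \subseteq K}\ \sum_{G \subseteq A_J:\ G | \{A_j\}_{j \in J}} \frac{\varrho_{K,G}}{4N}\,
         [A_{K'} \cup G]_t\, [A_{K \setminus K'} \cup G^c]_t\Big]$,

where $G^c$ is the complement of $G$ in $A_J$ and $\varrho_{K,G}$ is defined as

(13)  $\varrho_{K,G} := \sum_{D \subseteq A_I}\ \sum_{H \subseteq A_K:\ H | \{A_k\}_{k \in K}} \varrho_{H \cup D \cup G}$.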

Remark 2. The right-hand side of (12) may be read in the following way. The set $I$ indicates the
parts of $\mathcal{A}$ that remain unchanged under the corresponding recombination event, the sets $J$ and
$K$ indicate sets for which the derived processes $U_t$ and $V_t$, respectively, increase. So the splitting
of $Z_t$ into $U_t$ and $V_t$ does not only simplify the calculation but also shows up in the result.

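Proof of Theorem 1. For $\delta t > 0$, define

$(A_\ell)^{+}_{\delta t} := (A_\ell)_{t + \delta t} - (A_\ell)_t$  and  $(A_\ell)^{-}_{\delta t} := \langle A_\ell \rangle_{t + \delta t} - \langle A_\ell \rangle_t$

(the increments of the marginals of $U_t$ and $V_t$ at the sites $A_\ell$, respectively); then

$[A_\ell]_{t + \delta t} = [A_\ell]_t + (A_\ell)^{+}_{\delta t} - (A_\ell)^{-}_{\delta t}$

and

$\prod_{\ell \in M} [A_\ell]_{t + \delta t} = \sum_{(I,J,K) \in \mathcal{P}}\ \prod_{i \in I} [A_i]_t\ \prod_{j \in J} (A_j)^{+}_{\delta t}\ \prod_{k \in K} \big(-(A_k)^{-}_{\delta t}\big)$.

Let $t + \delta t$ be the time of the first recombination event after time $t$.
Then, a summand $\prod_{i \in I} [A_i]_t \prod_{j \in J} (A_j)^{+}_{\delta t} \prod_{k \in K} \big(-(A_k)^{-}_{\delta t}\big)$ may evaluate to:

• zero if there is any $j \in J$ or $k \in K$ such that $(A_j)^{+}_{\delta t} = 0$ or $(A_k)^{-}_{\delta t} = 0$,

• $(-1)^{|K|} \prod_{i \in I} [A_i]_t$ otherwise, that means if $(A_j)^{+}_{\delta t} = (A_k)^{-}_{\delta t} = 1$ for all $j \in J$, $k \in K$.

The latter transition comes from recombination events that correspond to the union of some $G$
disrupting $\{A_j\}_{j \in J}$ and $H$ disrupting $\{A_k\}_{k \in K}$ and any subset $D$ of $A_I$. At such recombination
events, the recombining individuals must be of the following form: $x$-alleles at $G$, $G^c$ resp., $x$-alleles
at $A_k$, $k \in K$, whereas the particular $A_k$ may be arbitrarily distributed across the two individuals
(but the individual sets may not be disrupted!). Thus, the complete rate reads

(14)  $r(I,J,K) := \sum_{K' \subseteq K}\ \sum_{G \subseteq A_J:\ G | \{A_j\}_{j \in J}} \frac{\varrho_{K,G}}{4N}\, [G \cup A_{K'}]_t\, [A_{K \setminus K'} \cup G^c]_t$,

with $G^c$ and $\varrho_{K,G}$ as defined above. This is the rate of the event that the terms corresponding to
$J$ and $K$ increase, that means it is the rate of all recombination events such that a binding arises
in each $A_j$, $j \in J$, and a binding breaks in each $A_k$, $k \in K$.
Thus,

$\frac{d}{dt}\, \mathbb{E}\Big[\prod_{\ell \in M} [A_\ell]_t\Big] = \mathbb{E}\Big[\sum_{(I,J,K) \in \mathcal{P},\ I \neq M} (-1)^{|K|} \prod_{i \in I} [A_i]_t\ r(I,J,K)\Big]$,

which is the assertion of the theorem.

□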



Let us now consider the implication of the theorem for the moment-closure problem. The
theorem tells us that the dynamics of the mean of a product of marginal processes defined by a
partition of sites can be described by the means of other products of marginal processes defined
by (finer) partitions of sites. Since the number of sites is finite, and so is the number of partitions
of sites, the moment closure approach (for the mean) directly leads to a finite and linear system of
ODEs. We have thus proved

Corollary 1. For the Moran model with recombination alone, the moment approach closes. □

The size of these systems explodes with the number of sites. Nevertheless, there is much
redundancy in the concrete calculation of particular means. For example, in the analysis of
$\mathbb{E}\big[[1, 2, 3, 4]_t\big]$, marginal processes on three sites emerge. According to Remark 1 these can be
treated as recombination processes on three sites, so by a proper summation of the recombination
rates, one can easily determine their solutions given the solution of the three-sites recombination
process.
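
To get a feeling for how quickly such systems grow, one can count the partitions of $n$ sites, which is the Bell number $B_n$. The sketch below computes Bell numbers with the Bell triangle; reading $B_n$ as a rough upper bound on the number of unknowns (for one fixed reference type $x$) is an interpretation offered here, not a statement taken from the text, and the function name is hypothetical.

```python
def bell_number(n):
    """Number of partitions of an n-element set, computed via the Bell triangle."""
    row = [1]
    for _ in range(n - 1):
        new = [row[-1]]            # each row starts with the last entry of the previous row
        for value in row:
            new.append(new[-1] + value)
        row = new
    return row[-1]


counts = [bell_number(n) for n in range(1, 11)]
```

For ten sites the count already exceeds one hundred thousand, which illustrates why exploiting the redundancy between marginal systems matters in concrete calculations.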

3.2. Comparison with the deterministic dynamics. We now want to compare the result of
Theorem 1 to the corresponding deterministic dynamics. To this end, let $\mathcal{M}(X)$ be the space of
all measures on $X$. For $G \subseteq S$ define the recombinator* $R_G$ by

$R_G(\omega) := \frac{1}{\|\omega\|}\, (\pi_G.\omega) \otimes (\pi_{\bar G}.\omega)$

with $R_G(0) = 0$. Consider the following dynamical system on $\mathcal{M}(X)$:

(15)  $\dot\omega_t = \sum_{G \subseteq S} \frac{\varrho_G}{2}\, (R_G - 1)(\omega_t)$.

This is the infinite population limit of the recombination process (without and with resampling)
in the following sense. If we consider $\hat Z_t := \frac{1}{N} Z_t$ and let $\lim_{N \to \infty} \hat Z_0 = p_0$, then

(16)  $\lim_{N \to \infty}\ \sup_{s \leq t} |\hat Z_s - p_s| = 0$

with probability 1, where $p_t$ is the solution of the initial value problem (15) with $\omega_0 = p_0$. This
is shown in [4] for the special case of single crossovers, but it is obvious that the proof, which is
based on the general law of large numbers by Ethier and Kurtz ([10, Thm. 11.2.1], see also [13]),
may be generalised to the case of multiple crossovers.

* This is a generalisation of the recombinator in [1]. Note that the notational similarity is deceptive because $G$
denotes sites here rather than 'links' (the bonds between sites) as in [1].
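
We are now interested in the relationship between (12) and the deterministic dynamics. If $\omega_t$
follows (15), then a (tensor) product of marginal measures $(\pi_{A_1}.\omega_t) \otimes \cdots \otimes (\pi_{A_m}.\omega_t)$ given by a
partition of sites as in Theorem 1 exhibits the following dynamics:

(17)  $\frac{d}{dt}\big((\pi_{A_1}.\omega_t) \otimes \cdots \otimes (\pi_{A_m}.\omega_t)\big)$

      $= \Big(\pi_{A_1}.\big(\sum_{G \subseteq S} \frac{\varrho_G}{2}(R_G - 1)\big)(\omega_t)\Big) \otimes (\pi_{A_2}.\omega_t) \otimes \cdots \otimes (\pi_{A_m}.\omega_t)$

      $\ + (\pi_{A_1}.\omega_t) \otimes \Big(\pi_{A_2}.\big(\sum_{G \subseteq S} \frac{\varrho_G}{2}(R_G - 1)\big)(\omega_t)\Big) \otimes (\pi_{A_3}.\omega_t) \otimes \cdots \otimes (\pi_{A_m}.\omega_t)$

      $\ + \cdots + (\pi_{A_1}.\omega_t) \otimes \cdots \otimes (\pi_{A_{m-1}}.\omega_t) \otimes \Big(\pi_{A_m}.\big(\sum_{G \subseteq S} \frac{\varrho_G}{2}(R_G - 1)\big)(\omega_t)\Big)$

      $= \sum_{i=1}^{m}\ \sum_{B \subseteq A_i} \frac{\varrho^{A_i}_B}{2}\, (\pi_{A_1}.\omega_t) \otimes \cdots \otimes
         \Big[\frac{1}{\|\omega_t\|}(\pi_B.\omega_t) \otimes (\pi_{A_i \setminus B}.\omega_t) - (\pi_{A_i}.\omega_t)\Big] \otimes \cdots \otimes (\pi_{A_m}.\omega_t)$,

with $\varrho^{A_i}_B := \sum_{H \subseteq S:\ H \cap A_i = B} \varrho_H$.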

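Compare this to (12), and only consider summands where $|J| = 1$ and $|K| = 0$, or $|J| = 0$
and $|K| = 1$. According to Remark 2 we can understand the corresponding transitions as
'uncorrelated' events at which only one marginal process changes at a given instant. We get the
following terms on the right-hand side of (12):

• $J = \{j\}$, $K = \emptyset$:

(18)  $\mathbb{E}\Big[\prod_{i \in I} [A_i]_t \sum_{G \subseteq A_j:\ G | A_j} \frac{\varrho_{\emptyset,G}}{4N}\, [G]_t\, [G^c]_t\Big]$,

• $J = \emptyset$, $K = \{j\}$:

(19)  $\mathbb{E}\Big[\prod_{i \in I} [A_i]_t\ \frac{\varrho_{\{j\},\emptyset}}{2N}\, (-1)\, [A_j]_t\, N\Big]$

with (cf. (13))

$\varrho_{\{j\},\emptyset} = \sum_{D \subseteq A_I}\ \sum_{H \subseteq A_j:\ H | A_j} \varrho_{H \cup D}
  = \sum_{H \subseteq A_j:\ H | A_j}\ \sum_{D \subseteq A_I} \varrho_{H \cup D}
  = \sum_{H \subseteq A_j:\ H | A_j} \varrho^{A_j}_H$.

Adding all terms of kind (18) and (19), one obtains the analogue of the right-hand side of (17).
According to Remark 2, the summands with $|J \cup K| \geq 2$ correspond to 'correlated' events at which
two or more marginal processes change simultaneously.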

We may thus conclude that the uncorrelated events correspond to the deterministic equation.
We will now show that the correlated events are of lower order and thus tend to zero in the
limit $N \to \infty$. To this end, look at (11). The right-hand side consists of the terms $\frac{1}{N}[G]_t[\bar G]_t$
and $[1, \dots, n]_t$. They are both of order $N$, since each individual term $[\dots]_t$ is of order $N$ (which
follows, for example, from (16)). Let us look at the derivative of the mean of $\frac{1}{N}[G]_t[\bar G]_t$ (cf. (12)).
The terms with $|I| = 1$ (those belonging to 'uncorrelated' events) will be of order $N$ again, whereas
the terms with $|I| = 0$ are of order 1. By differentiating terms such as $\mathbb{E}\big[\frac{1}{N^2}[A_1]_t[A_2]_t[A_3]_t\big]$ and
beyond, the same observation applies: the order of summands belonging to 'correlated' events is
less than or equal to 1, so for the relative frequencies ($[1, \dots, n]_t / N$) the dynamics of the mean tends to
the dynamics of the deterministic model.

3.3. Two sites, arbitrary moments. In the case of two sites, the recombination process is
rather simple. This mainly relies on the fact that the transition rate for $(1,2)_t$ is constant, as we
have already seen in (10). Furthermore the set of partitions of two sites is trivial. In this special
case we can easily show moment closure for arbitrary moments. The simplicity of the setting
permits us to look at $[1,2]_t^m$ itself without considering the counting processes for creation and
break-up events separately. The process $[1,2]_t^m$ has the following possible transitions (cf. (7), (8)):

$[1,2]_t^m \to ([1,2]_t + 1)^m$ at rate $\frac{\varrho_1}{N}\big([1]_t - [1,2]_t\big)\big([2]_t - [1,2]_t\big)$

and

$[1,2]_t^m \to ([1,2]_t - 1)^m$ at rate $\frac{\varrho_1}{N}[1,2]_t\big(N - [1]_t - [2]_t + [1,2]_t\big)$.

Using the binomial theorem and eliminating empty transitions, we obtain for the $m$-th moment:

$\frac{d}{dt}\, \mathbb{E}\big[[1,2]_t^m\big]
  = \frac{\varrho_1}{N} \sum_{k=0}^{m-1} \mathbb{E}\Big[\binom{m}{k} [1,2]_t^k \big([1]_t - [1,2]_t\big)\big([2]_t - [1,2]_t\big)\Big]
  + \frac{\varrho_1}{N} \sum_{k=0}^{m-1} \mathbb{E}\Big[\binom{m}{k} (-1)^{m-k} [1,2]_t^k\, [1,2]_t \big(N - [1]_t - [2]_t + [1,2]_t\big)\Big]$

$  = \frac{\varrho_1}{N} \sum_{k=0}^{m-1} \binom{m}{k}\, \mathbb{E}\Big[[1,2]_t^k \Big\{[1]_t[2]_t - 2\delta^{(2)}_{m-k}[1,2]_t\, c + 2\delta^{(2)}_{m-k}[1,2]_t^2 + (-1)^{m-k}[1,2]_t N\Big\}\Big]$

with $c := [1]_t + [2]_t$ and

$\delta^{(2)}_{m-k} := 1$ if $m - k \equiv 0 \bmod 2$, and $0$ otherwise.

In particular, the summand with $k = m - 1$ equals $m\, [1,2]_t^{m-1}\big([1]_t[2]_t - [1,2]_t N\big)$.
So all emerging terms are moments of order $m$ or less.
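
The binomial expansion can be checked state by state. The sketch below assumes that both two-site transition rates carry the common prefactor $\varrho_1/N$ (this reading of the rates is an assumption) and compares the direct drift of $[1,2]^m$ with the expanded sum over $k$, in which terms of order $[1,2]^{k+2}$ only occur for even $m - k$. The function names are hypothetical; `A`, `B`, `C` stand for $[1]_t$, $[2]_t$, $[1,2]_t$.

```python
from fractions import Fraction
from math import comb


def drift_direct(m, A, B, C, N, rho1):
    """Drift of C**m from the up and down transition rates of C = [1,2]."""
    up = Fraction(rho1, N) * (A - C) * (B - C)
    down = Fraction(rho1, N) * C * (N - A - B + C)
    return up * ((C + 1) ** m - C ** m) + down * ((C - 1) ** m - C ** m)


def drift_expanded(m, A, B, C, N, rho1):
    """Same drift after the binomial expansion, grouped by the power k of C."""
    c = A + B
    total = 0
    for k in range(m):
        even = 1 if (m - k) % 2 == 0 else 0   # indicator that m - k is even
        inner = A * B - 2 * even * C * c + 2 * even * C ** 2 + (-1) ** (m - k) * C * N
        total += comb(m, k) * C ** k * inner
    return Fraction(rho1, N) * total


state = dict(A=6, B=5, C=3, N=10)
values = [(drift_direct(m, rho1=2, **state), drift_expanded(m, rho1=2, **state))
          for m in range(1, 7)]
```

Each pair in `values` consists of two identical rational numbers. Since the indicator vanishes for $k = m - 1$, the highest power of $[1,2]$ on the right-hand side is $m$, in line with the closing remark above.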

4. Recombination and Mutation 

We now want to add mutation to our process. Let us first look at the process with mutation
alone, i.e. $b = \varrho_G = 0$. By Lemma 1 the derivative of the mean is:

[1, ■ • ■,"-]«] = X! ( X! t^yx^^' ---J - 1,J + l,---,n]t- ^ fii^y[l,...,n]ty 



-E 



dt 



So, it only consists of linear terms and marginal processes. When we consider a product of marginal 
processes given by a partition of sites as in the previous section, we have, due to the fact that 
mutation only acts on single sites independently of others: 

|E[n[^.]*]=E[E( n [^»]*E EM^t^AW]*) 

eeM ieM ieM\{£} jeAe yeXj 

-E( n [^^]*E E ^'^.yM 

e<£M ieM\{e} jeAe y(^Xj\{x.} 

What happens when we add recombination? Let $F_M$ and $F_R$, respectively, be the 'mean rate of
change functions' from Lemma 1 for $\prod_{\ell \in M} [A_\ell]_t$ from the process with solely mutation and
recombination, respectively. Since mutation and recombination proceed independently, the respective
function $F_{RM}$ for the recombination-mutation process is then just $F_R + F_M$ and according to
Lemma 1 we have:



jAmM.]-^[ E nt^^M-D'-'E E 



kdK GdAj 

G\{A,},^j 



Bk,g 
AN 



[A^\jG\t[A^^j,iJG 



E( n [^^]*E Ea'^.^J^aw: 

f«EM ieM\{t} jeAe yeXj 

E( n [^«]*E E ^'^..[^^]' 

ieM ieM\{l} 3eAeyeXj\{x^} 



So, the arising terms are the same as in the pure recombination process plus linear terms. It is
therefore clear that we have moment closure here as well.

5. Recombination and Resampling

In this section, we set $b > 0$ and the mutation rates zero again, so we look at the Moran model
with recombination and resampling only. At first glance, one may think that resampling has no
effect on the expectation, since the process with resampling alone has a constant mean. Indeed,
the first derivative of the mean looks the same as in the pure recombination case:

(20)  $\frac{d}{dt}\, \mathbb{E}\big[[1,2]_t\big] = \frac{\varrho_1}{N}\, \mathbb{E}\big[[1]_t[2]_t - [1,2]_t N\big]$.

However, due to resampling, the one-site marginal processes are no longer constant, so we do not
have instantaneous moment closure any more (at a resampling event, the frequency of
alleles may change). The derivative of their product is obtained after an elementary but lengthy
calculation:

(21)  $\frac{d}{dt}\, \mathbb{E}\big[[1]_t[2]_t\big] = \frac{b}{N}\, \mathbb{E}\big[[1,2]_t N - [1]_t[2]_t\big]$.

We obtain a finite linear system of differential equations, namely (20) and (21). In particular, we
obtain

(22)  $\frac{d}{dt}\, \mathbb{E}\big[[1,2]_t N - [1]_t[2]_t\big] = -\Big(\varrho_1 + \frac{b}{N}\Big)\, \mathbb{E}\big[[1,2]_t N - [1]_t[2]_t\big]$

with the obvious exponential solution. The term $[1,2]_t N - [1]_t[2]_t$ is a correlation function, a
so-called linkage disequilibrium, which is widely used in population genetics. We see that both
recombination and resampling reduce correlations between sites.
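
The two-site system can also be integrated numerically to see the exponential decay of the linkage disequilibrium. The sketch below assumes the linear system in the form $\dot c = \frac{\varrho_1}{N}(p - Nc)$, $\dot p = \frac{b}{N}(Nc - p)$, where $c$ stands for $\mathbb{E}[[1,2]_t]$ and $p$ for $\mathbb{E}[[1]_t[2]_t]$; the prefactors, the function name and the parameter values are assumptions of this illustration. It uses a classical Runge-Kutta step and compares $Nc - p$ with $D_0\, e^{-(\varrho_1 + b/N)t}$.

```python
import math


def ld_trajectory(c0, p0, N, rho1, b, t_end, steps=2000):
    """Integrate the assumed two-site moment system with a fourth-order Runge-Kutta scheme.

    c stands for E[[1,2]_t] and p for E[[1]_t [2]_t]; returns (c, p) at time t_end.
    """
    def rhs(c, p):
        return (rho1 / N) * (p - N * c), (b / N) * (N * c - p)

    h = t_end / steps
    c, p = c0, p0
    for _ in range(steps):
        k1 = rhs(c, p)
        k2 = rhs(c + h / 2 * k1[0], p + h / 2 * k1[1])
        k3 = rhs(c + h / 2 * k2[0], p + h / 2 * k2[1])
        k4 = rhs(c + h * k3[0], p + h * k3[1])
        c += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        p += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return c, p


N, rho1, b = 50, 0.3, 1.0
c0, p0 = 20.0, 20.0 * 30.0     # e.g. [1,2]_0 = 20, [1]_0 = 20, [2]_0 = 30 (deterministic start)
c, p = ld_trajectory(c0, p0, N, rho1, b, t_end=4.0)
d_numeric = N * c - p
d_exact = (N * c0 - p0) * math.exp(-(rho1 + b / N) * 4.0)
```

Under these assumptions `d_numeric` and `d_exact` agree up to the integration error, and both recombination (through `rho1`) and resampling (through `b / N`) contribute to the decay rate.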

For more than two sites, exact moment closure can no longer be established. To make this
plausible, we will only present the derivative of $\mathbb{E}\big[[1]_t[2]_t[3]_t\big]$ (again, the calculation is elementary
but lengthy), which is a term that emerges in the derivatives of the process on three sites due to
recombination:

$\frac{d}{dt}\, \mathbb{E}\big[[1]_t[2]_t[3]_t\big] = \frac{b}{N}\, \mathbb{E}\big[[1]_t[2,3]_t N + [2]_t[1,3]_t N + [3]_t[1,2]_t N + [1]_t[1,2]_t[1,3]_t$

$\qquad + [2]_t[1,2]_t[2,3]_t + [3]_t[1,3]_t[2,3]_t - 3[1]_t[2]_t[3]_t - [1]_t^2[1,2,3]_t - [2]_t^2[1,2,3]_t - [3]_t^2[1,2,3]_t\big]$.

The last three terms are quadratic and it is clear that further differentiating will lead to terms such
as $[1]_t^2[2]_t[3]_t$ whose derivative will contain moments of even higher order. Thus, the interaction
between recombination and resampling destroys moment closure.

6. Conclusion 

In this paper, we have extended the single-crossover Moran model from [1] to include general
recombination. The dynamics of the expectation under general recombination becomes significantly
more complicated. In particular, it now deviates from the dynamics in the infinite population
model. The reason is the loss of independence of certain marginal processes.

As is usual with nonlinear processes, the dynamics of a given moment requires higher moments. 
Nevertheless, in this case after a finite number of steps no additional terms emerge. This is due 
to the fact that the arising processes may in each step be described by a partition of sites. When 
mutation is included, this exact moment closure persists, but the arising processes can no longer 
be described by a partition of sites. Altogether, we have an exception to the rule that the dynamics
of the moments of nonlinear processes leads to infinite hierarchies of ODEs.

This exact moment closure gets lost when we extend the model to include genetic drift (i.e., 
resampling). This is, of course, disappointing since the Moran model with recombination alone is 
mathematically interesting, but of limited biological value. Nevertheless, the resulting hierarchy 
of moments might be interesting to analyse with respect to the various possibilities of approximate 
moment closure. 

Furthermore, the arising terms such as $\mathbb{E}\big[\prod_{\ell \in M} [A_\ell]_t\big]$ are of considerable interest in population
genetics beyond this moment closure procedure, since they are the building blocks of the linkage
disequilibria [6] that are so important in population genetics (compare (22) for the simplest
example).